Bayesian Compression for Deep Learning
Compression and computational efficiency in deep learning have become a
problem of great significance. In this work, we argue that the most principled
and effective way to attack this problem is by adopting a Bayesian point of
view, where through sparsity inducing priors we prune large parts of the
network. We introduce two novelties in this paper: 1) we use hierarchical
priors to prune nodes instead of individual weights, and 2) we use the
posterior uncertainties to determine the optimal fixed point precision to
encode the weights. Both factors significantly contribute to achieving the
state of the art in terms of compression rates, while still staying competitive
with methods designed to optimize for speed or energy efficiency.
Comment: Published as a conference paper at NIPS 201
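To make the second idea in the abstract above concrete, the following is a minimal sketch of how posterior uncertainty could set a fixed-point precision. It is not the paper's procedure: the function names `bits_from_uncertainty` and `quantize`, the uniform grid, and the rule "grid step must not exceed the mean posterior standard deviation" are all assumptions chosen for illustration.

```python
import math


def bits_from_uncertainty(means, stds):
    """Smallest bit width whose uniform grid step is <= the mean posterior std.

    Hypothetical rule for illustration only: a weight with a wide posterior
    tolerates a coarse grid, so fewer bits are needed to encode it.
    """
    value_range = max(means) - min(means)
    tolerated_step = sum(stds) / len(stds)
    if tolerated_step <= 0:
        raise ValueError("posterior standard deviations must be positive")
    if value_range == 0:
        return 1
    # A grid with 2**b levels has step range / (2**b - 1); requiring
    # step <= tolerated_step gives 2**b >= range / tolerated_step + 1.
    levels = value_range / tolerated_step + 1
    return max(1, math.ceil(math.log2(levels)))


def quantize(weights, bits, lo, hi):
    """Round each weight to the nearest point of a uniform grid on [lo, hi]."""
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


means = [-1.0, 0.0, 1.0]   # posterior means (made-up values)
stds = [0.25, 0.25, 0.25]  # posterior standard deviations (made-up values)
bits = bits_from_uncertainty(means, stds)
print(bits, quantize([0.3], bits, min(means), max(means)))
```

Under this assumed rule, doubling the posterior standard deviations lowers the bit width, which is the qualitative behaviour the abstract describes.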
Optical Music Recognition with Convolutional Sequence-to-Sequence Models
Optical Music Recognition (OMR) is an important technology within Music
Information Retrieval. Deep learning models show promising results on OMR
tasks, but symbol-level annotated data sets of sufficient size to train such
models are not available and are difficult to develop. We present a deep learning
architecture called a Convolutional Sequence-to-Sequence model to both move
towards an end-to-end trainable OMR pipeline, and apply a learning process that
trains on full sentences of sheet music instead of individually labeled
symbols. The model is trained and evaluated on a human-generated data set, with
various image augmentations based on real-world scenarios. This data set is the
first publicly available set in OMR research with sufficient size to train and
evaluate deep learning models. With the introduced augmentations, a pitch
recognition accuracy of 81% and a duration accuracy of 94% are achieved,
resulting in a note-level accuracy of 80%. Finally, the model is compared to
commercially available methods, showing large improvements over these
applications.
Comment: ISMIR 201
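The three accuracy figures in the abstract above can be sanity-checked with a short calculation. The sketch assumes that a note counts as correct only when both its pitch and its duration are correct, which the abstract does not state explicitly. The function name `joint_accuracy_bounds` is hypothetical.

```python
def joint_accuracy_bounds(acc_a, acc_b):
    """Bounds on the rate at which two predictions are both correct.

    lower: worst case, errors overlap as little as possible
    independent: value if the two kinds of error were independent
    upper: best case, every error of the better task hits an already wrong item
    """
    lower = max(0.0, acc_a + acc_b - 1.0)
    independent = acc_a * acc_b
    upper = min(acc_a, acc_b)
    return lower, independent, upper


# Pitch and duration accuracies as reported in the abstract.
lower, independent, upper = joint_accuracy_bounds(0.81, 0.94)
print(f"lower={lower:.4f} independent={independent:.4f} upper={upper:.4f}")
```

Under that assumption, the reported 80% lies inside the feasible interval and above the independence value of about 76%, which would suggest that pitch and duration errors tend to fall on the same notes.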
Improved Bayesian Compression
Compression of Neural Networks (NN) has become a highly studied topic in
recent years. The main reason for this is the demand for industrial scale usage
of NNs such as deploying them on mobile devices, storing them efficiently,
transmitting them via band-limited channels and most importantly doing
inference at scale. In this work, we propose to join the Soft-Weight Sharing
and Variational Dropout approaches that show strong results to define a new
state of the art in terms of model compression.
An optimal control perspective on diffusion-based generative modeling
We establish a connection between stochastic optimal control and generative
models based on stochastic differential equations (SDEs) such as recently
developed diffusion probabilistic models. In particular, we derive a
Hamilton-Jacobi-Bellman equation that governs the evolution of the
log-densities of the underlying SDE marginals. This perspective allows
methods from optimal control theory to be transferred to generative modeling. First, we
show that the evidence lower bound is a direct consequence of the well-known
verification theorem from control theory. Further, we develop a novel
diffusion-based method for sampling from unnormalized densities -- a problem
frequently occurring in statistics and computational sciences.
Comment: Accepted for oral presentation at NeurIPS 2022 Workshop on
Score-Based Method
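As a sketch of why the log-density of an SDE marginal satisfies a Hamilton-Jacobi-Bellman-type equation, consider the simplified setting below. The notation is an assumption made for illustration (drift $\mu$, scalar time-dependent diffusion coefficient $\sigma(t)$, marginal density $p$, and $V = \log p$) and need not match the paper's own conventions.

```latex
% Assumed forward SDE with scalar diffusion coefficient:
%   dX_t = \mu(X_t, t)\,dt + \sigma(t)\,dW_t
% Fokker-Planck equation for the marginal density p(x, t):
\partial_t p = -\nabla \cdot (\mu\, p) + \tfrac{1}{2}\sigma^2 \Delta p
% Substitute p = e^{V}, using
%   \partial_t p = p\,\partial_t V, \quad
%   \nabla \cdot (\mu\, p) = p\,(\nabla \cdot \mu + \mu \cdot \nabla V), \quad
%   \Delta p = p\,(\Delta V + \|\nabla V\|^2),
% and divide by p:
\partial_t V = -\nabla \cdot \mu - \mu \cdot \nabla V
  + \tfrac{1}{2}\sigma^2 \Delta V + \tfrac{1}{2}\sigma^2 \|\nabla V\|^2
```

The quadratic gradient term $\tfrac{1}{2}\sigma^2 \|\nabla V\|^2$ is the nonlinearity characteristic of Hamilton-Jacobi-Bellman equations, which is what opens the door to control-theoretic tools.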
Impact of policy networks in the GATT Uruguay Round: The case of the US-EC agricultural negotiations.
This thesis investigates the membership, activities and policy impact of three distinct groups of policy networks operating within and between the agricultural policy environments of the US and EC as well as at the multilateral level during the preparation for and negotiations of the GATT Uruguay Round between 1980 and 1993. Briefly defined, these three groups are: 1) epistemic communities - networks of professionals who share both specialized knowledge and expertise in a specific issue area; 2) advocacy coalitions - policy actors from various levels of the policy process who share common policy beliefs and work together to turn these policy beliefs into government policy; and 3) elite transnational networks - incorporating political leaders, political appointees and senior government and international institutional officials, these elite level networks are formed through regular contact in either an official or unofficial capacity. The contention of this thesis is that various networks of actors within the distinct policy networks of epistemic communities, advocacy coalitions and elite transnational networks contributed significantly to bringing about the reform of agricultural policy that occurred within the EC and the US between 1980 and 1993 allowing for the establishment of consensus on the liberalization of agricultural trade policy at the multilateral level of the General Agreement on Tariffs and Trade during the Uruguay Round. The hypothesis of this thesis is that these three policy networks varied in their impact according to the specific stage of negotiations due to changing policy needs. 
I argue that in general: 1) epistemic communities exhibited the most impact during the agenda-setting stage owing in part to their expertise in agricultural trade issues, the existence of a common framework for discussion and their work in creating analytical tools that allowed agricultural liberalization to be politically and economically viable; 2) advocacy coalitions had the most significant role during the second, or policy-making stage, due to their ability to work within the policy environment and shape domestic policy development; and 3) elite transnational networks, due to their ability to provide the necessary political pressure, had the greatest impact in the third, or breakthrough stage
Latent Discretization for Continuous-time Sequence Compression
Neural compression offers a domain-agnostic approach to creating codecs for
lossy or lossless compression via deep generative models. For sequence
compression, however, most deep sequence models have costs that scale with the
sequence length rather than the sequence complexity. In this work, we instead
treat data sequences as observations from an underlying continuous-time process
and learn how to efficiently discretize while retaining information about the
full sequence. As a consequence of decoupling sequential information from its
temporal discretization, our approach allows for greater compression rates and
smaller computational complexity. Moreover, the continuous-time approach
naturally allows us to decode at different time intervals. We empirically
verify our approach on multiple domains involving compression of video and
motion capture sequences, showing that our approaches can automatically achieve
reductions in bit rates by learning how to discretize.
Measuring IPDE-SQ personality disorder prevalence in pre-sentence and early-stage prison populations, with sub-type estimates
Understanding the prevalence and type of personality disorder within prison systems allows for the effective targeting of resources to implement strategies to alleviate symptoms, manage behaviour and attempt to reduce re-offending. This study aimed to determine the prevalence of personality disorder (PD) traits within a local urban high-turnover adult male prison with a remand/recently sentenced population in London, UK. The International Personality Disorder Examination–Screening Questionnaire (IPDE-SQ) self-administered questionnaire (ICD-10 version) was completed by 283 prisoners (42% completion rate). 77% of respondents reached the threshold for one or more PDs. The most common PD types were Paranoid PD (44.5%), Anankastic PD (40.3%), Schizoid PD (35%) and Dissocial PD (25.8%). These results confirm and extend existing knowledge regarding the prevalence of PD in prison populations into a high-turnover, urban, remand population. The stark comparison with community samples indicates that a more equitable standard of service delivery within the criminal justice system, focussing on preventive and early intervention services, is now required